<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE topic
  PUBLIC "-//OASIS//DTD DITA Composite//EN" "ditabase.dtd">
<topic id="topic1">
 <title>Developing a Background Worker Process</title>
 <body>
  <p> Greenplum Database can be extended to run user-supplied code in separate processes. Such
   processes are started, stopped, and monitored by <codeph>postgres</codeph>, which permits them to
   have a lifetime closely linked to the server's status. These processes have the option to attach
   to Greenplum Database's shared memory area and to connect to databases internally; they can also
   run multiple transactions serially, just like a regular client-connected server process. Also, by
   linking to <codeph>libpq</codeph> they can connect to the server and behave like a regular client
   application. </p>
  <note type="warning"> There are considerable robustness and security risks in using background
   worker processes because, being written in the <codeph>C</codeph> language, they have
   unrestricted access to data. Administrators wishing to enable modules that include background
   worker processes should exercise extreme caution. Only carefully audited modules should be
   permitted to run background worker processes.</note>
  <p> Background workers can be initialized at the time that Greenplum Database is started by
   including the module name in the <varname>shared_preload_libraries</varname> server configuration
   parameter. A module wishing to run a background worker can register it by calling
    <codeph>RegisterBackgroundWorker(BackgroundWorker *worker)</codeph> from its
    <codeph>_PG_init()</codeph>. Background workers can also be started after the system is up and
   running by calling the function <codeph>RegisterDynamicBackgroundWorker(BackgroundWorker *worker,
    BackgroundWorkerHandle **handle)</codeph>. Unlike <codeph>RegisterBackgroundWorker</codeph>,
   which can only be called from within the <codeph>postmaster</codeph>,
    <codeph>RegisterDynamicBackgroundWorker</codeph> must be called from a regular backend. </p>
  <p> The structure <codeph>BackgroundWorker</codeph> is defined thus:  </p>
  <codeblock>
typedef void (*bgworker_main_type)(Datum main_arg);
typedef struct BackgroundWorker
{
    char        bgw_name[BGW_MAXLEN];
    int         bgw_flags;
    BgWorkerStartTime bgw_start_time;
    int         bgw_restart_time;       /* in seconds, or BGW_NEVER_RESTART */
    bgworker_main_type bgw_main;
    char        bgw_library_name[BGW_MAXLEN];   /* only if bgw_main is NULL */
    char        bgw_function_name[BGW_MAXLEN];  /* only if bgw_main is NULL */
    Datum       bgw_main_arg;
    int         bgw_notify_pid;
} BackgroundWorker;
</codeblock>
  <p>
   <codeph>bgw_name</codeph> is a string to be used in log messages, process listings and similar
   contexts. </p>
  <p>
   <codeph>bgw_flags</codeph> is a bitwise-or'd bit mask indicating the capabilities that the module
   wants. Possible values are <codeph>BGWORKER_SHMEM_ACCESS</codeph> (requesting shared memory
   access) and <codeph>BGWORKER_BACKEND_DATABASE_CONNECTION</codeph> (requesting the ability to
   establish a database connection, through which it can later run transactions and queries). A
   background worker using <codeph>BGWORKER_BACKEND_DATABASE_CONNECTION</codeph> to connect to a
   database must also attach shared memory using <codeph>BGWORKER_SHMEM_ACCESS</codeph>, or worker
   start-up will fail. </p>
  <p>
   <codeph>bgw_start_time</codeph> is the server state during which <codeph>postgres</codeph> should
   start the process; it can be one of <codeph>BgWorkerStart_PostmasterStart</codeph> (start as soon
   as <codeph>postgres</codeph> itself has finished its own initialization; processes requesting
   this are not eligible for database connections), <codeph>BgWorkerStart_ConsistentState</codeph>
   (start as soon as a consistent state has been reached in a hot standby, allowing processes to
   connect to databases and run read-only queries), and
    <codeph>BgWorkerStart_RecoveryFinished</codeph> (start as soon as the system has entered normal
   read-write state). Note the last two values are equivalent in a server that's not a hot standby.
   Note that this setting only indicates when the processes are to be started; they do not stop when
   a different state is reached. </p>
  <p>
   <codeph>bgw_restart_time</codeph> is the interval, in seconds, that <codeph>postgres</codeph>
   should wait before restarting the process, in case it crashes. It can be any positive value, or
    <codeph>BGW_NEVER_RESTART</codeph>, indicating not to restart the process in case of a crash. </p>
  <p>
   <codeph>bgw_main</codeph> is a pointer to the function to run when the process is started. This
   function must take a single argument of type <codeph>Datum</codeph> and return
    <codeph>void</codeph>. <codeph>bgw_main_arg</codeph> will be passed to it as its only argument.
   Note that the global variable <codeph>MyBgworkerEntry</codeph> points to a copy of the
    <codeph>BackgroundWorker</codeph> structure passed at registration time.
    <codeph>bgw_main</codeph> may be NULL; in that case, <codeph>bgw_library_name</codeph> and
    <codeph>bgw_function_name</codeph> will be used to determine the entry point. This is useful for
   background workers launched after postmaster startup, where the postmaster does not have the
   requisite library loaded. </p>
  <p>
   <codeph>bgw_library_name</codeph> is the name of a library in which the initial entry point for
   the background worker should be sought. It is ignored unless <codeph>bgw_main</codeph> is NULL.
   But if <codeph>bgw_main</codeph> is NULL, then the named library will be dynamically loaded by
   the worker process and <codeph>bgw_function_name</codeph> will be used to identify the function
   to be called. </p>
  <p>
   <codeph>bgw_function_name</codeph> is the name of a function in a dynamically loaded library
   which should be used as the initial entry point for a new background worker. It is ignored unless
    <codeph>bgw_main</codeph> is NULL. </p>
  <p>
   <codeph>bgw_notify_pid</codeph> is the PID of a Greenplum Database backend process to which the
   postmaster should send <codeph>SIGUSR1</codeph> when the process is started or exits. It should
   be 0 for workers registered at postmaster startup time, or when the backend registering the
   worker does not wish to wait for the worker to start up. Otherwise, it should be initialized to
    <codeph>MyProcPid</codeph>. </p>
  <p>Once running, the process can connect to a database by calling
     <codeph>BackgroundWorkerInitializeConnection(<codeph>char *dbname</codeph>, <codeph>char
     *username</codeph>)</codeph>. This allows the process to run transactions and queries using the
    <codeph>SPI</codeph> interface. If <varname>dbname</varname> is NULL, the session is not
   connected to any particular database, but shared catalogs can be accessed. If
    <varname>username</varname> is NULL, the process will run as the superuser created during
    <codeph>initdb</codeph>. BackgroundWorkerInitializeConnection can only be called once per
   background process, it is not possible to switch databases. </p>
  <p> Signals are initially blocked when control reaches the <codeph>bgw_main</codeph> function, and
   must be unblocked by it; this is to allow the process to customize its signal handlers, if
   necessary. Signals can be unblocked in the new process by calling
    <codeph>BackgroundWorkerUnblockSignals</codeph> and blocked by calling
    <codeph>BackgroundWorkerBlockSignals</codeph>. </p>
  <p> If <codeph>bgw_restart_time</codeph> for a background worker is configured as
    <codeph>BGW_NEVER_RESTART</codeph>, or if it exits with an exit code of 0 or is terminated by
    <codeph>TerminateBackgroundWorker</codeph>, it will be automatically unregistered by the
   postmaster on exit. Otherwise, it will be restarted after the time period configured via
    <codeph>bgw_restart_time</codeph>, or immediately if the postmaster reinitializes the cluster
   due to a backend failure. Backends which need to suspend execution only temporarily should use an
   interruptible sleep rather than exiting; this can be achieved by calling
    <codeph>WaitLatch()</codeph>. Make sure the <codeph>WL_POSTMASTER_DEATH</codeph> flag is set
   when calling that function, and verify the return code for a prompt exit in the emergency case
   that <codeph>postgres</codeph> itself has terminated. </p>
  <p> When a background worker is registered using the
    <codeph>RegisterDynamicBackgroundWorker</codeph> function, it is possible for the backend
   performing the registration to obtain information regarding the status of the worker. Backends
   wishing to do this should pass the address of a <codeph>BackgroundWorkerHandle *</codeph> as the
   second argument to <codeph>RegisterDynamicBackgroundWorker</codeph>. If the worker is
   successfully registered, this pointer will be initialized with an opaque handle that can
   subsequently be passed to <codeph>GetBackgroundWorkerPid(<codeph>BackgroundWorkerHandle
     *</codeph>, <codeph>pid_t *</codeph>)</codeph> or
     <codeph>TerminateBackgroundWorker(<codeph>BackgroundWorkerHandle *</codeph>)</codeph>.
    <codeph>GetBackgroundWorkerPid</codeph> can be used to poll the status of the worker: a return
   value of <codeph>BGWH_NOT_YET_STARTED</codeph> indicates that the worker has not yet been started
   by the postmaster; <codeph>BGWH_STOPPED</codeph> indicates that it has been started but is no
   longer running; and <codeph>BGWH_STARTED</codeph> indicates that it is currently running. In this
   last case, the PID will also be returned via the second argument.
    <codeph>TerminateBackgroundWorker</codeph> causes the postmaster to send
    <codeph>SIGTERM</codeph> to the worker if it is running, and to unregister it as soon as it is
   not. </p>
  <p> In some cases, a process which registers a background worker may wish to wait for the worker
   to start up. This can be accomplished by initializing <codeph>bgw_notify_pid</codeph> to
    <codeph>MyProcPid</codeph> and then passing the <codeph>BackgroundWorkerHandle *</codeph>
   obtained at registration time to
     <codeph>WaitForBackgroundWorkerStartup(<codeph>BackgroundWorkerHandle *handle</codeph>,
     <codeph>pid_t *</codeph>)</codeph> function. This function will block until the postmaster has
   attempted to start the background worker, or until the postmaster dies. If the background runner
   is running, the return value will <codeph>BGWH_STARTED</codeph>, and the PID will be written to
   the provided address. Otherwise, the return value will be <codeph>BGWH_STOPPED</codeph> or
    <codeph>BGWH_POSTMASTER_DIED</codeph>. </p>
  <p> The <codeph>worker_spi</codeph> contrib module contains a working example, which demonstrates
   some useful techniques. </p>
  <p> The maximum number of registered background workers is limited by max-worker-processes. </p>
 </body>
</topic>
